European Ad Hoc Retrieval Experiments with Hummingbird SearchServer™ at CLEF 2005

Author

  • Stephen Tomlinson
Abstract

Hummingbird participated in the 4 monolingual information retrieval tasks (Bulgarian, French, Hungarian and Portuguese) of the Ad-Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2005. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant documents (with high precision) in a particular document set. We conducted diagnostic experiments with different techniques for matching word variations and handling stopwords. We found that the experimental stemmers significantly increased mean average precision for the 4 languages. Analysis of individual topics found that the algorithmic Bulgarian and Hungarian stemmers encountered some unanticipated stopword collisions. A comparison to an experimental 4-gram technique suggested that Hungarian stemming would further benefit from decompounding. A blind feedback technique which significantly increased mean average precision for some languages was also significantly detrimental to the rank of the first relevant retrieved for one language.

Similar resources

Lexical and Algorithmic Stemming Compared for 9 European Languages with Hummingbird SearchServer™ at CLEF 2003

Hummingbird participated in the monolingual information retrieval tasks of the Cross-Language Evaluation Forum (CLEF) 2003: for natural language queries in 9 European languages (German, French, Italian, Spanish, Dutch, Finnish, Swedish, Russian and English) find all the relevant documents (with high precision) in the CLEF 2003 document sets. For each language, SearchServer scored higher than th...

Hungarian Monolingual Retrieval at CLEF 2005

We describe our official runs for the ad hoc monolingual task in Hungarian for CLEF 2005. We conducted experiments with four stemmers of varying impact. The experiments indicate that stemmers focusing on noun inflection are as effective as more broadly oriented stemmers, and that extensive stemming is especially beneficial for Hungarian monolingual retrieval.

Ad-hoc Mono- and Bilingual Retrieval Experiments at the University of Hildesheim

This paper reports on our participation in CLEF 2005's ad-hoc multi-lingual retrieval track. The ad-hoc task introduced Bulgarian and Hungarian as new languages. Our experiments focus on the two new languages. Naturally, no relevance assessments are available for these collections yet. Optimization was mainly based on French data from last year. Based on experience from last year, one of our ma...

European Web Retrieval Experiments with Hummingbird SearchServer™ at CLEF 2005

Hummingbird participated in the mixed monolingual retrieval task of the WebCLEF Track of the Cross-Language Evaluation Forum (CLEF) 2005. In this task, the system was given 547 known-item queries from 11 languages (134 Spanish, 121 English, 59 Dutch, 59 Portuguese, 57 German, 35 Hungarian, 30 Danish, 30 Russian, 16 Greek, 5 Icelandic and 1 French). The goal was to find the desired page in the 8...

An Evaluation of Greek-English Cross Language Retrieval within the CLEF Ad-Hoc Bilingual Task

This article describes an experimental investigation of the use of web resources for a common Natural Language Processing (NLP) problem, that of Word Sense Disambiguation (WSD). In particular we use our disambiguation experiments with statistical query translation on a Greek-English cross language retrieval system using Google’s n-grams. Results from our participation on the Ad-Hoc TEL trac...

Publication date: 2005